Overview

Dataset statistics

Number of variables: 20
Number of observations: 694409
Missing cells: 1712237
Missing cells (%): 12.3%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 106.0 MiB
Average record size in memory: 160.0 B
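
The missing-cell percentage follows from the counts above: the table holds one cell per variable per observation. A minimal sketch of that arithmetic in Python, using only the figures reported in this section (the variable names are illustrative, not part of the report):

```python
# Figures copied from the "Dataset statistics" table above.
n_variables = 20
n_observations = 694_409
missing_cells = 1_712_237

# Every variable contributes one cell per observation.
total_cells = n_variables * n_observations

# Share of cells that are missing, as a percentage.
missing_pct = 100 * missing_cells / total_cells

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
```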

Variable types

Categorical: 10
Numeric: 7
Unsupported: 2
Boolean: 1

Warnings

State has constant value "Niger" Constant
AFTERNOON has constant value "0" Constant
ADR_IDS has constant value "6,1" Constant
Regimen has a high cardinality: 90 distinct values High cardinality
PHARMACY_ID is highly correlated with PATIENT_ID High correlation
PATIENT_ID is highly correlated with PHARMACY_ID and 1 other field High correlation
FACILITY_ID is highly correlated with PATIENT_ID and 2 other fields High correlation
EVENING is highly correlated with FACILITY_ID and 1 other field High correlation
ADHERENCE is highly correlated with FACILITY_ID and 1 other field High correlation
PHARMACY_ID is highly correlated with PATIENT_ID and 1 other field High correlation
PATIENT_ID is highly correlated with PHARMACY_ID High correlation
FACILITY_ID is highly correlated with ADHERENCE High correlation
EVENING is highly correlated with PHARMACY_ID and 1 other field High correlation
ADHERENCE is highly correlated with FACILITY_ID and 1 other field High correlation
PHARMACY_ID is highly correlated with PATIENT_ID High correlation
PATIENT_ID is highly correlated with PHARMACY_ID High correlation
EVENING is highly correlated with ADHERENCE High correlation
ADHERENCE is highly correlated with EVENING High correlation
Facility Name is highly correlated with PATIENT_ID and 5 other fields High correlation
Regimen Line is highly correlated with Regimen High correlation
PATIENT_ID is highly correlated with Facility Name and 5 other fields High correlation
L.G.A is highly correlated with Facility Name and 5 other fields High correlation
DMOC_TYPE is highly correlated with PHARMACY_ID High correlation
Regimen is highly correlated with Facility Name and 5 other fields High correlation
PHARMACY_ID is highly correlated with Facility Name and 5 other fields High correlation
FACILITY_ID is highly correlated with Facility Name and 2 other fields High correlation
ADHERENCE is highly correlated with Facility Name and 4 other fields High correlation
State is highly correlated with DMOC_TYPE and 9 other fields High correlation
DMOC_TYPE is highly correlated with State and 3 other fields High correlation
Regimen is highly correlated with State and 4 other fields High correlation
Facility Name is highly correlated with State and 4 other fields High correlation
Regimen Line is highly correlated with State and 3 other fields High correlation
ADR_SCREENED is highly correlated with State and 2 other fields High correlation
AFTERNOON is highly correlated with State and 9 other fields High correlation
ADR_IDS is highly correlated with State and 9 other fields High correlation
PRESCRIP_ERROR is highly correlated with State and 2 other fields High correlation
L.G.A is highly correlated with State and 4 other fields High correlation
ADHERENCE is highly correlated with State and 6 other fields High correlation
ADR_SCREENED has 502740 (72.4%) missing values Missing
ADR_IDS has 694406 (> 99.9%) missing values Missing
DMOC_TYPE has 515067 (74.2%) missing values Missing
DURATION is highly skewed (γ1 = 65.07331267) Skewed
MORNING is highly skewed (γ1 = 234.8112474) Skewed
BODY_WEIGHT is highly skewed (γ1 = 20.88914214) Skewed
PHARMACY_ID has unique values Unique
DATE_VISIT is an unsupported type, check if it needs cleaning or further analysis Unsupported
NEXT_APPOINTMENT is an unsupported type, check if it needs cleaning or further analysis Unsupported
MORNING has 561743 (80.9%) zeros Zeros
EVENING has 547892 (78.9%) zeros Zeros
BODY_WEIGHT has 691674 (99.6%) zeros Zeros
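
The Skewed warnings above report γ1, the skewness coefficient. As a rough illustration of why zero-heavy columns with a few large entries score so high, the sketch below computes the population (moment-based) skewness in plain Python. The `skewness` helper and the sample data are hypothetical, and the report's own figures may come from a bias-adjusted estimator, so this is not expected to reproduce them exactly.

```python
def skewness(values):
    """Population skewness g1 = m3 / m2**1.5 (moment-based, no bias correction)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    if m2 == 0:
        return float("nan")  # constant column: skewness is undefined
    return m3 / m2 ** 1.5

# Hypothetical column: mostly zeros plus one large entry.
sample = [0] * 99 + [100]

print(round(skewness(sample), 3))
print(skewness([1, 2, 3]))  # symmetric data, so the third moment vanishes
```

A single outlier among many zeros is enough to push the coefficient far from 0, which is the pattern the MORNING, DURATION, and BODY_WEIGHT warnings point at.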

Reproduction

Analysis started: 2021-06-15 09:01:38.570413
Analysis finished: 2021-06-15 09:03:12.291346
Duration: 1 minute and 33.72 seconds
Software version: pandas-profiling v3.0.0
Download configuration: config.json

Variables

State
Categorical

CONSTANT
HIGH CORRELATION
REJECTED

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 5.3 MiB
Niger: 694409

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Characters and Unicode

Total characters: 3472045
Distinct characters: 5
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
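
To make the category counts concrete, the sketch below tallies Unicode general categories for the constant value `Niger` using Python's standard `unicodedata` module. This is a stand-in for illustration; the profiler's own implementation is not shown in this report, and the `category_counts` helper is hypothetical.

```python
import unicodedata
from collections import Counter

def category_counts(text):
    """Count characters by Unicode general category (e.g. 'Lu', 'Ll')."""
    return Counter(unicodedata.category(ch) for ch in text)

counts = category_counts("Niger")
print(dict(counts))
print("distinct categories:", len(counts))
```

One uppercase letter (`Lu`) and four lowercase letters (`Ll`) give two distinct categories, in line with the figure reported for State.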

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Niger
2nd row: Niger
3rd row: Niger
4th row: Niger
5th row: Niger

Common Values

ValueCountFrequency (%)
Niger694409
100.0%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
niger694409
100.0%

Most occurring characters

ValueCountFrequency (%)
N694409
20.0%
i694409
20.0%
g694409
20.0%
e694409
20.0%
r694409
20.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter2777636
80.0%
Uppercase Letter694409
 
20.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i694409
25.0%
g694409
25.0%
e694409
25.0%
r694409
25.0%
Uppercase Letter
ValueCountFrequency (%)
N694409
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin3472045
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
N694409
20.0%
i694409
20.0%
g694409
20.0%
e694409
20.0%
r694409
20.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII3472045
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
N694409
20.0%
i694409
20.0%
g694409
20.0%
e694409
20.0%
r694409
20.0%

L.G.A
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 14
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 5.3 MiB
Bida: 196055
Kontagora: 93113
Lapai: 76673
Borgu: 72502
Rafi: 55176
Other values (9): 200890

Length

Max length: 9
Median length: 5
Mean length: 5.465787454
Min length: 4

Characters and Unicode

Total characters: 3795492
Distinct characters: 27
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Magama
2nd row: Magama
3rd row: Magama
4th row: Magama
5th row: Magama

Common Values

ValueCountFrequency (%)
Bida196055
28.2%
Kontagora93113
13.4%
Lapai76673
 
11.0%
Borgu72502
 
10.4%
Rafi55176
 
7.9%
Mokwa49129
 
7.1%
Rijau43269
 
6.2%
Shiroro38560
 
5.6%
Wushishi34446
 
5.0%
Magama16402
 
2.4%
Other values (4)19084
 
2.7%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
bida196055
28.2%
kontagora93113
13.4%
lapai76673
 
11.0%
borgu72502
 
10.4%
rafi55176
 
7.9%
mokwa49129
 
7.1%
rijau43269
 
6.2%
shiroro38560
 
5.6%
wushishi34446
 
5.0%
magama16402
 
2.4%
Other values (4)19084
 
2.7%

Most occurring characters

ValueCountFrequency (%)
a754031
19.9%
i481165
12.7%
o384977
10.1%
B268557
 
7.1%
r245275
 
6.5%
d196055
 
5.2%
g185970
 
4.9%
u166761
 
4.4%
h108865
 
2.9%
n108244
 
2.9%
Other values (17)895592
23.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter3101083
81.7%
Uppercase Letter694409
 
18.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a754031
24.3%
i481165
15.5%
o384977
12.4%
r245275
 
7.9%
d196055
 
6.3%
g185970
 
6.0%
u166761
 
5.4%
h108865
 
3.5%
n108244
 
3.5%
t93113
 
3.0%
Other values (10)376627
12.1%
Uppercase Letter
ValueCountFrequency (%)
B268557
38.7%
R98445
 
14.2%
K93113
 
13.4%
L89864
 
12.9%
M71424
 
10.3%
S38560
 
5.6%
W34446
 
5.0%

Most occurring scripts

ValueCountFrequency (%)
Latin3795492
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a754031
19.9%
i481165
12.7%
o384977
10.1%
B268557
 
7.1%
r245275
 
6.5%
d196055
 
5.2%
g185970
 
4.9%
u166761
 
4.4%
h108865
 
2.9%
n108244
 
2.9%
Other values (17)895592
23.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII3795492
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a754031
19.9%
i481165
12.7%
o384977
10.1%
B268557
 
7.1%
r245275
 
6.5%
d196055
 
5.2%
g185970
 
4.9%
u166761
 
4.4%
h108865
 
2.9%
n108244
 
2.9%
Other values (17)895592
23.6%

Facility Name
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 20
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 5.3 MiB
Federal Medical Centre - Bida: 110430
General Hospital Kontagora: 87583
General Hospital -Bida: 85625
General Hospital - Lapai: 71906
General Hospital - New Bussa: 69864
Other values (15): 269001

Length

Max length: 43
Median length: 26
Mean length: 23.54726105
Min length: 8

Characters and Unicode

Total characters: 16351430
Distinct characters: 41
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Rural Hosp- Auna
2nd row: Rural Hosp- Auna
3rd row: Rural Hosp- Auna
4th row: Rural Hosp- Auna
5th row: Rural Hosp- Auna

Common Values

ValueCountFrequency (%)
Federal Medical Centre - Bida110430
15.9%
General Hospital Kontagora87583
12.6%
General Hospital -Bida85625
12.3%
General Hospital - Lapai71906
10.4%
General Hospital - New Bussa69864
10.1%
General Hospital - Kagara55176
7.9%
G. Hosp Mokwa49129
7.1%
General Hospital T Magajiya43269
 
6.2%
Rural Hosp38560
 
5.6%
CHC Zungeru33081
 
4.8%
Other values (10)49786
7.2%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
general444115
17.1%
hospital432864
16.7%
307376
11.8%
bida196055
 
7.6%
centre117373
 
4.5%
medical115960
 
4.5%
federal110430
 
4.3%
hosp103686
 
4.0%
kontagora95653
 
3.7%
lapai71906
 
2.8%
Other values (30)600731
23.1%

Most occurring characters

ValueCountFrequency (%)
a2158388
13.2%
1901740
 
11.6%
e1593001
 
9.7%
l1159798
 
7.1%
i937002
 
5.7%
r918334
 
5.6%
o793359
 
4.9%
n696743
 
4.3%
s696509
 
4.3%
t673685
 
4.1%
Other values (31)4822871
29.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter11625314
71.1%
Uppercase Letter2375275
 
14.5%
Space Separator1901740
 
11.6%
Dash Punctuation399972
 
2.4%
Other Punctuation49129
 
0.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a2158388
18.6%
e1593001
13.7%
l1159798
10.0%
i937002
8.1%
r918334
7.9%
o793359
 
6.8%
n696743
 
6.0%
s696509
 
6.0%
t673685
 
5.8%
p608456
 
5.2%
Other values (11)1390039
12.0%
Uppercase Letter
ValueCountFrequency (%)
H578449
24.4%
G498011
21.0%
B271097
11.4%
M215951
 
9.1%
C190940
 
8.0%
K177211
 
7.5%
F110430
 
4.6%
N84825
 
3.6%
L71906
 
3.0%
R43306
 
1.8%
Other values (7)133149
 
5.6%
Space Separator
ValueCountFrequency (%)
1901740
100.0%
Dash Punctuation
ValueCountFrequency (%)
-399972
100.0%
Other Punctuation
ValueCountFrequency (%)
.49129
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin14000589
85.6%
Common2350841
 
14.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
a2158388
15.4%
e1593001
11.4%
l1159798
 
8.3%
i937002
 
6.7%
r918334
 
6.6%
o793359
 
5.7%
n696743
 
5.0%
s696509
 
5.0%
t673685
 
4.8%
p608456
 
4.3%
Other values (28)3765314
26.9%
Common
ValueCountFrequency (%)
1901740
80.9%
-399972
 
17.0%
.49129
 
2.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII16351430
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a2158388
13.2%
1901740
 
11.6%
e1593001
 
9.7%
l1159798
 
7.1%
i937002
 
5.7%
r918334
 
5.6%
o793359
 
4.9%
n696743
 
4.3%
s696509
 
4.3%
t673685
 
4.1%
Other values (31)4822871
29.5%

Regimen Line
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 13
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 5.3 MiB
ART First Line Adult: 598437
Cotrimoxazole (CTX) Prophylaxis: 47727
Isoniazid Preventive Therapy (IPT): 18706
ART First Line Children: 12906
ART Second Line Adult: 9624
Other values (8): 7009

Length

Max length: 46
Median length: 20
Mean length: 21.3161523
Min length: 4

Characters and Unicode

Total characters: 14802128
Distinct characters: 41
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: ART First Line Adult
2nd row: ART First Line Adult
3rd row: ART First Line Adult
4th row: ART First Line Adult
5th row: ART First Line Adult

Common Values

ValueCountFrequency (%)
ART First Line Adult598437
86.2%
Cotrimoxazole (CTX) Prophylaxis47727
 
6.9%
Isoniazid Preventive Therapy (IPT)18706
 
2.7%
ART First Line Children12906
 
1.9%
ART Second Line Adult9624
 
1.4%
ARV Prophylaxis for Pregnant Women2402
 
0.3%
Other anti-infectives (including STI Medicine)1926
 
0.3%
Other Medicines1217
 
0.2%
ART Second Line Children1089
 
0.2%
OI Treatment233
 
< 0.1%
Other values (3)142
 
< 0.1%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
line622192
22.8%
art622056
22.8%
first611343
22.4%
adult608066
22.3%
prophylaxis50129
 
1.8%
cotrimoxazole47727
 
1.7%
ctx47727
 
1.7%
preventive18706
 
0.7%
ipt18706
 
0.7%
isoniazid18706
 
0.7%
Other values (18)65699
 
2.4%

Most occurring characters

ValueCountFrequency (%)
2036648
13.8%
i1417556
 
9.6%
t1295715
 
8.8%
A1232524
 
8.3%
e788012
 
5.3%
r768928
 
5.2%
l721843
 
4.9%
T709500
 
4.8%
n702603
 
4.7%
s683321
 
4.6%
Other values (31)4445478
30.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter8516159
57.5%
Uppercase Letter4110677
27.8%
Space Separator2036648
 
13.8%
Open Punctuation68359
 
0.5%
Close Punctuation68359
 
0.5%
Dash Punctuation1926
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i1417556
16.6%
t1295715
15.2%
e788012
9.3%
r768928
9.0%
l721843
8.5%
n702603
8.3%
s683321
8.0%
d656685
7.7%
u609992
7.2%
o227533
 
2.7%
Other values (11)643971
7.6%
Uppercase Letter
ValueCountFrequency (%)
A1232524
30.0%
T709500
17.3%
R624458
15.2%
L622192
15.1%
F611343
14.9%
C109449
 
2.7%
P89945
 
2.2%
X47727
 
1.2%
I39571
 
1.0%
S12639
 
0.3%
Other values (6)11329
 
0.3%
Space Separator
ValueCountFrequency (%)
2036648
100.0%
Open Punctuation
ValueCountFrequency (%)
(68359
100.0%
Close Punctuation
ValueCountFrequency (%)
)68359
100.0%
Dash Punctuation
ValueCountFrequency (%)
-1926
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin12626836
85.3%
Common2175292
 
14.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
i1417556
 
11.2%
t1295715
 
10.3%
A1232524
 
9.8%
e788012
 
6.2%
r768928
 
6.1%
l721843
 
5.7%
T709500
 
5.6%
n702603
 
5.6%
s683321
 
5.4%
d656685
 
5.2%
Other values (27)3650149
28.9%
Common
ValueCountFrequency (%)
2036648
93.6%
(68359
 
3.1%
)68359
 
3.1%
-1926
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII14802128
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2036648
13.8%
i1417556
 
9.6%
t1295715
 
8.8%
A1232524
 
8.3%
e788012
 
5.3%
r768928
 
5.2%
l721843
 
4.9%
T709500
 
4.8%
n702603
 
4.7%
s683321
 
4.6%
Other values (31)4445478
30.0%

Regimen
Categorical

HIGH CARDINALITY
HIGH CORRELATION
HIGH CORRELATION

Distinct: 90
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 5.3 MiB
TDF(300mg)+3TC(300mg)+DTG(50mg): 206616
AZT(300mg)+3TC(150mg)+ABC(300mg): 170496
TDF(300mg)+3TC(300mg)+LPV/r(200/50mg): 122259
Cotrimoxazole 960mg: 46009
TDF(300mg)+3TC(300mg)+EFV(600mg): 26837
Other values (85): 122192

Length

Max length: 62
Median length: 32
Mean length: 30.99477109
Min length: 10

Characters and Unicode

Total characters: 21523048
Distinct characters: 56
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 4
Unique (%): < 0.1%

Sample

1st row: TDF(300mg)+3TC(300mg)+DTG(50mg)
2nd row: TDF(300mg)+3TC(300mg)+DTG(50mg)
3rd row: TDF(300mg)+3TC(300mg)+DTG(50mg)
4th row: TDF(300mg)+3TC(300mg)+DTG(50mg)
5th row: TDF(300mg)+3TC(300mg)+DTG(50mg)

Common Values

ValueCountFrequency (%)
TDF(300mg)+3TC(300mg)+DTG(50mg)206616
29.8%
AZT(300mg)+3TC(150mg)+ABC(300mg)170496
24.6%
TDF(300mg)+3TC(300mg)+LPV/r(200/50mg)122259
17.6%
Cotrimoxazole 960mg46009
 
6.6%
TDF(300mg)+3TC(300mg)+EFV(600mg)26837
 
3.9%
TDF/FTC(300/200mg)+NVP(200mg)21423
 
3.1%
AZT(300mg)+3TC(150mg)+NVP(200mg)19513
 
2.8%
Isoniazid 300mg17964
 
2.6%
TDF/FTC(300/200mg)+EFV(600mg)16308
 
2.3%
AZT(300mg)+3TC(150mg)+EFV(600mg)8994
 
1.3%
Other values (80)37990
 
5.5%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
tdf(300mg)+3tc(300mg)+dtg(50mg206616
27.1%
azt(300mg)+3tc(150mg)+abc(300mg170496
22.3%
tdf(300mg)+3tc(300mg)+lpv/r(200/50mg122259
16.0%
cotrimoxazole47809
 
6.3%
960mg46009
 
6.0%
tdf(300mg)+3tc(300mg)+efv(600mg26837
 
3.5%
tdf/ftc(300/200mg)+nvp(200mg21423
 
2.8%
azt(300mg)+3tc(150mg)+nvp(200mg19513
 
2.6%
isoniazid18642
 
2.4%
300mg17964
 
2.4%
Other values (96)65704
 
8.6%

Most occurring characters

ValueCountFrequency (%)
03520062
16.4%
m1951692
9.1%
g1895039
8.8%
(1825826
 
8.5%
)1825826
 
8.5%
31764673
 
8.2%
T1456622
 
6.8%
+1201249
 
5.6%
C849591
 
3.9%
D616749
 
2.9%
Other values (46)4615719
21.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number6442623
29.9%
Uppercase Letter5090070
23.6%
Lowercase Letter4700825
21.8%
Open Punctuation1825826
 
8.5%
Close Punctuation1825826
 
8.5%
Math Symbol1201249
 
5.6%
Other Punctuation367766
 
1.7%
Space Separator68863
 
0.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
m1951692
41.5%
g1895039
40.3%
r183812
 
3.9%
o164731
 
3.5%
i90638
 
1.9%
a69908
 
1.5%
z66614
 
1.4%
l58804
 
1.3%
t50275
 
1.1%
x49795
 
1.1%
Other values (12)119517
 
2.5%
Uppercase Letter
ValueCountFrequency (%)
T1456622
28.6%
C849591
16.7%
D616749
12.1%
F500755
 
9.8%
A391432
 
7.7%
V241030
 
4.7%
G210511
 
4.1%
Z208073
 
4.1%
P179810
 
3.5%
B177920
 
3.5%
Other values (8)257577
 
5.1%
Decimal Number
ValueCountFrequency (%)
03520062
54.6%
31764673
27.4%
5548098
 
8.5%
1227030
 
3.5%
2220329
 
3.4%
6105280
 
1.6%
946009
 
0.7%
49584
 
0.1%
81461
 
< 0.1%
797
 
< 0.1%
Other Punctuation
ValueCountFrequency (%)
/367760
> 99.9%
,6
 
< 0.1%
Open Punctuation
ValueCountFrequency (%)
(1825826
100.0%
Close Punctuation
ValueCountFrequency (%)
)1825826
100.0%
Math Symbol
ValueCountFrequency (%)
+1201249
100.0%
Space Separator
ValueCountFrequency (%)
68863
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common11732153
54.5%
Latin9790895
45.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
m1951692
19.9%
g1895039
19.4%
T1456622
14.9%
C849591
8.7%
D616749
 
6.3%
F500755
 
5.1%
A391432
 
4.0%
V241030
 
2.5%
G210511
 
2.2%
Z208073
 
2.1%
Other values (30)1469401
15.0%
Common
ValueCountFrequency (%)
03520062
30.0%
(1825826
15.6%
)1825826
15.6%
31764673
15.0%
+1201249
 
10.2%
5548098
 
4.7%
/367760
 
3.1%
1227030
 
1.9%
2220329
 
1.9%
6105280
 
0.9%
Other values (6)126020
 
1.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII21523048
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
03520062
16.4%
m1951692
9.1%
g1895039
8.8%
(1825826
 
8.5%
)1825826
 
8.5%
31764673
 
8.2%
T1456622
 
6.8%
+1201249
 
5.6%
C849591
 
3.9%
D616749
 
2.9%
Other values (46)4615719
21.4%

PHARMACY_ID
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
UNIQUE

Distinct: 694409
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1831938.885
Minimum: 30577
Maximum: 4082396
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 5.3 MiB

Quantile statistics

Minimum: 30577
5-th percentile: 78221.4
Q1: 253799
Median: 2026173
Q3: 3248503
95-th percentile: 3806789.6
Maximum: 4082396
Range: 4051819
Interquartile range (IQR): 2994704

Descriptive statistics

Standard deviation: 1409940.293
Coefficient of variation (CV): 0.7696437388
Kurtosis: -1.622628204
Mean: 1831938.885
Median Absolute Deviation (MAD): 1494246
Skewness: -0.05367781955
Sum: 1.272114849 × 10^12
Variance: 1.987931629 × 10^12
Monotonicity: Strictly increasing
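
Several of the derived statistics above can be checked against the others. The sketch below recomputes the range, IQR, coefficient of variation, and variance for PHARMACY_ID from the reported minimum, maximum, quartiles, mean, and standard deviation. The variable names are illustrative and the inputs are the rounded figures printed in this report, so small rounding differences are expected.

```python
# Figures copied from the PHARMACY_ID statistics above.
minimum, maximum = 30_577, 4_082_396
q1, q3 = 253_799, 3_248_503
mean = 1_831_938.885
std = 1_409_940.293

value_range = maximum - minimum   # spread between the extremes
iqr = q3 - q1                     # spread of the middle 50% of values
cv = std / mean                   # coefficient of variation (unitless)
variance = std ** 2               # square of the standard deviation

print(value_range, iqr)
print(f"CV = {cv:.4f}")
print(f"variance = {variance:.4e}")
```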
Histogram with fixed size bins (bins=50)

Common values

Value                   Count    Frequency (%)
3455056                 1        < 0.1%
509357                  1        < 0.1%
495012                  1        < 0.1%
2590117                 1        < 0.1%
2596262                 1        < 0.1%
2594215                 1        < 0.1%
2384273                 1        < 0.1%
2614697                 1        < 0.1%
2618795                 1        < 0.1%
2608556                 1        < 0.1%
Other values (694399)   694399   > 99.9%

Minimum values

Value   Count   Frequency (%)
30577   1       < 0.1%
30578   1       < 0.1%
30579   1       < 0.1%
30580   1       < 0.1%
30581   1       < 0.1%
30582   1       < 0.1%
30583   1       < 0.1%
30584   1       < 0.1%
30585   1       < 0.1%
30586   1       < 0.1%

Maximum values

Value     Count   Frequency (%)
4082396   1       < 0.1%
4082395   1       < 0.1%
4082394   1       < 0.1%
4082393   1       < 0.1%
4082389   1       < 0.1%
4082388   1       < 0.1%
4082387   1       < 0.1%
4082177   1       < 0.1%
4082173   1       < 0.1%
4082170   1       < 0.1%

PATIENT_ID
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 19002
Distinct (%): 2.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 63993.45849
Minimum: 8217
Maximum: 160854
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 5.3 MiB

Quantile statistics

Minimum: 8217
5-th percentile: 9304
Q1: 13232
Median: 19731
Q3: 111670
95-th percentile: 144970
Maximum: 160854
Range: 152637
Interquartile range (IQR): 98438

Descriptive statistics

Standard deviation: 53895.07033
Coefficient of variation (CV): 0.8421965558
Kurtosis: -1.659011182
Mean: 63993.45849
Median Absolute Deviation (MAD): 11105
Skewness: 0.2691505114
Sum: 4.443763352 × 10^10
Variance: 2904678606
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value                  Count    Frequency (%)
143852                 372      0.1%
143759                 360      0.1%
145143                 344      < 0.1%
144092                 321      < 0.1%
145763                 320      < 0.1%
144641                 314      < 0.1%
145451                 304      < 0.1%
143911                 293      < 0.1%
144749                 290      < 0.1%
143741                 285      < 0.1%
Other values (18992)   691206   99.5%

Minimum values

Value   Count   Frequency (%)
8217    7       < 0.1%
8218    36      < 0.1%
8219    25      < 0.1%
8220    23      < 0.1%
8221    42      < 0.1%
8222    15      < 0.1%
8223    42      < 0.1%
8224    7       < 0.1%
8225    20      < 0.1%
8226    3       < 0.1%

Maximum values

Value    Count   Frequency (%)
160854   3       < 0.1%
160849   4       < 0.1%
160837   5       < 0.1%
160836   5       < 0.1%
160835   4       < 0.1%
160833   3       < 0.1%
160759   5       < 0.1%
160717   5       < 0.1%
160715   4       < 0.1%
160714   5       < 0.1%

FACILITY_ID
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 20
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 9052.747443
Minimum: 3005
Maximum: 10026
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 5.3 MiB

Quantile statistics

Minimum: 3005
5-th percentile: 3005
Q1: 10013
Median: 10017
Q3: 10023
95-th percentile: 10025
Maximum: 10026
Range: 7021
Interquartile range (IQR): 10

Descriptive statistics

Standard deviation: 2417.144705
Coefficient of variation (CV): 0.2670067535
Kurtosis: 2.41941982
Mean: 9052.747443
Median Absolute Deviation (MAD): 5
Skewness: -2.102234411
Sum: 6286309299
Variance: 5842588.523
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=20)

Common values

Value               Count    Frequency (%)
10023               110430   15.9%
3005                87583    12.6%
10013               85625    12.3%
10016               71906    10.4%
10022               69864    10.1%
10014               55176    7.9%
10025               49129    7.1%
10017               43269    6.2%
10024               38560    5.6%
10018               33081    4.8%
Other values (10)   49786    7.2%

Minimum values

Value   Count   Frequency (%)
3005    87583   12.6%
3007    2540    0.4%
3008    5530    0.8%
10010   1365    0.2%
10011   1441    0.2%
10012   4767    0.7%
10013   85625   12.3%
10014   55176   7.9%
10015   14961   2.2%
10016   71906   10.4%

Maximum values

Value   Count    Frequency (%)
10026   1413     0.2%
10025   49129    7.1%
10024   38560    5.6%
10023   110430   15.9%
10022   69864    10.1%
10021   2638     0.4%
10020   1940     0.3%
10019   13191    1.9%
10018   33081    4.8%
10017   43269    6.2%

DATE_VISIT
Unsupported

REJECTED
UNSUPPORTED

Missing: 0
Missing (%): 0.0%
Memory size: 5.3 MiB

DURATION
Real number (ℝ≥0)

SKEWED

Distinct: 93
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 78.33866209
Minimum: 0
Maximum: 18090
Zeros: 101
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 5.3 MiB

Quantile statistics

Minimum: 0
5-th percentile: 30
Q1: 60
Median: 60
Q3: 90
95-th percentile: 180
Maximum: 18090
Range: 18090
Interquartile range (IQR): 30

Descriptive statistics

Standard deviation: 51.11679274
Coefficient of variation (CV): 0.6525104128
Kurtosis: 22228.308
Mean: 78.33866209
Median Absolute Deviation (MAD): 30
Skewness: 65.07331267
Sum: 54399072
Variance: 2612.9265
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value               Count    Frequency (%)
60                  322857   46.5%
90                  157517   22.7%
180                 88938    12.8%
30                  82785    11.9%
14                  18881    2.7%
120                 10904    1.6%
49                  3828     0.6%
15                  3758     0.5%
168                 1048     0.2%
7                   1016     0.1%
Other values (83)   2877     0.4%

Minimum values

Value   Count   Frequency (%)
0       101     < 0.1%
1       8       < 0.1%
2       3       < 0.1%
3       7       < 0.1%
4       4       < 0.1%
5       3       < 0.1%
6       2       < 0.1%
7       1016    0.1%
9       6       < 0.1%
10      10      < 0.1%

Maximum values

Value   Count   Frequency (%)
18090   1       < 0.1%
1820    1       < 0.1%
1800    8       < 0.1%
1210    1       < 0.1%
1168    1       < 0.1%
960     89      < 0.1%
900     1       < 0.1%
810     1       < 0.1%
720     1       < 0.1%
600     12      < 0.1%

MORNING
Real number (ℝ≥0)

SKEWED
ZEROS

Distinct: 17
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.219905416
Minimum: 0
Maximum: 960
Zeros: 561743
Zeros (%): 80.9%
Negative: 0
Negative (%): 0.0%
Memory size: 5.3 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 1
Maximum: 960
Range: 960
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 1.995967749
Coefficient of variation (CV): 9.076482907
Kurtosis: 94636.22191
Mean: 0.219905416
Median Absolute Deviation (MAD): 0
Skewness: 234.8112474
Sum: 152704.3
Variance: 3.983887256
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=17)

Common values

Value              Count    Frequency (%)
0                  561743   80.9%
1                  128884   18.6%
2                  1861     0.3%
3                  1771     0.3%
90                 127      < 0.1%
180                7        < 0.1%
120                3        < 0.1%
0.1                3        < 0.1%
15                 2        < 0.1%
1.06               1        < 0.1%
Other values (7)   7        < 0.1%

Minimum values

Value   Count    Frequency (%)
0       561743   80.9%
0.1     3        < 0.1%
1       128884   18.6%
1.05    1        < 0.1%
1.06    1        < 0.1%
1.09    1        < 0.1%
1.8     1        < 0.1%
2       1861     0.3%
3       1771     0.3%
15      2        < 0.1%

Maximum values

Value   Count   Frequency (%)
960     1       < 0.1%
650     1       < 0.1%
180     7       < 0.1%
120     3       < 0.1%
90      127     < 0.1%
60      1       < 0.1%
30      1       < 0.1%
15      2       < 0.1%
3       1771    0.3%
2       1861    0.3%

AFTERNOON
Categorical

CONSTANT
HIGH CORRELATION
REJECTED

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 5.3 MiB
0: 694409

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 694409
Distinct characters: 1
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

ValueCountFrequency (%)
0694409
100.0%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
0694409
100.0%

Most occurring characters

ValueCountFrequency (%)
0694409
100.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number694409
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0694409
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common694409
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0694409
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII694409
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0694409
100.0%

EVENING
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct: 6
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.2173344528
Minimum: 0
Maximum: 90
Zeros: 547892
Zeros (%): 78.9%
Negative: 0
Negative (%): 0.0%
Memory size: 5.3 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 1
Maximum: 90
Range: 90
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.4488841288
Coefficient of variation (CV): 2.065407132
Kurtosis: 2363.535512
Mean: 0.2173344528
Median Absolute Deviation (MAD): 0
Skewness: 14.04583624
Sum: 150919
Variance: 0.201496961
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=6)

Common values

Value   Count    Frequency (%)
0       547892   78.9%
1       144030   20.7%
3       1771     0.3%
2       713      0.1%
30      2        < 0.1%
90      1        < 0.1%

Minimum values

Value   Count    Frequency (%)
0       547892   78.9%
1       144030   20.7%
2       713      0.1%
3       1771     0.3%
30      2        < 0.1%
90      1        < 0.1%

Maximum values

Value   Count    Frequency (%)
90      1        < 0.1%
30      2        < 0.1%
3       1771     0.3%
2       713      0.1%
1       144030   20.7%
0       547892   78.9%

ADR_SCREENED
Boolean

HIGH CORRELATION
MISSING

Distinct: 2
Distinct (%): < 0.1%
Missing: 502740
Missing (%): 72.4%
Memory size: 1.3 MiB

Value       Count    Frequency (%)
False       191486   27.6%
True        183      < 0.1%
(Missing)   502740   72.4%

ADR_IDS
Categorical

CONSTANT
HIGH CORRELATION
MISSING
REJECTED

Distinct: 1
Distinct (%): 33.3%
Missing: 694406
Missing (%): > 99.9%
Memory size: 5.3 MiB
6,1
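
The Distinct (%) figure of 33.3% for ADR_IDS is consistent with the percentage being taken over non-missing values only, since just 3 of the 694409 rows hold a value. A minimal sketch of that arithmetic, under this assumption about the denominator (the variable names are illustrative):

```python
# Figures copied from the ADR_IDS summary and the dataset statistics.
n_observations = 694_409
n_missing = 694_406
n_distinct = 1

# Rows that actually carry a value for ADR_IDS.
n_present = n_observations - n_missing

# Assumed definition: distinct values as a share of non-missing rows.
distinct_pct = 100 * n_distinct / n_present

print(n_present)
print(f"{distinct_pct:.1f}%")
```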

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 9
Distinct characters: 3
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 6,1
2nd row: 6,1
3rd row: 6,1

Common Values

ValueCountFrequency (%)
6,13
 
< 0.1%
(Missing)694406
> 99.9%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
6,13
100.0%

Most occurring characters

ValueCountFrequency (%)
63
33.3%
,3
33.3%
13
33.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number6
66.7%
Other Punctuation3
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
63
50.0%
13
50.0%
Other Punctuation
ValueCountFrequency (%)
,3
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common9
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
63
33.3%
,3
33.3%
13
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII9
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
63
33.3%
,3
33.3%
13
33.3%

PRESCRIP_ERROR
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 5.3 MiB
0: 693706
1: 703

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 694409
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

ValueCountFrequency (%)
0693706
99.9%
1703
 
0.1%

Length

Histogram of lengths of the category

Pie chart

2021-06-15T09:03:17.430893image/svg+xmlMatplotlib v3.3.4, https://matplotlib.org/
ValueCountFrequency (%)
0693706
99.9%
1703
 
0.1%

Most occurring characters

Value | Count | Frequency (%)
0 | 693706 | 99.9%
1 | 703 | 0.1%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 694409 | 100.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 693706 | 99.9%
1 | 703 | 0.1%

Most occurring scripts

Value | Count | Frequency (%)
Common | 694409 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 693706 | 99.9%
1 | 703 | 0.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 694409 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 693706 | 99.9%
1 | 703 | 0.1%

ADHERENCE
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 5.3 MiB

Value | Count
0 | 624630
1 | 69779

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 694409
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 624630 | 90.0%
1 | 69779 | 10.0%

Length

Histogram of lengths of the category

Pie chart

Value | Count | Frequency (%)
0 | 624630 | 90.0%
1 | 69779 | 10.0%

Most occurring characters

Value | Count | Frequency (%)
0 | 624630 | 90.0%
1 | 69779 | 10.0%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 694409 | 100.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 624630 | 90.0%
1 | 69779 | 10.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 694409 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 624630 | 90.0%
1 | 69779 | 10.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 694409 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 624630 | 90.0%
1 | 69779 | 10.0%

NEXT_APPOINTMENT
Unsupported

REJECTED
UNSUPPORTED

Missing: 24
Missing (%): < 0.1%
Memory size: 5.3 MiB

DMOC_TYPE
Categorical

HIGH CORRELATION
MISSING

Distinct: 13
Distinct (%): < 0.1%
Missing: 515067
Missing (%): 74.2%
Memory size: 5.3 MiB

Value | Count
Same Facility Refill | 95453
MMD | 67085
Individual delivery/home-based | 13633
MMS | 2906
Different Facility Refill (Private hospital/clinic) | 167
Other values (8) | 98

Length

Max length: 51
Median length: 20
Mean length: 14.15255768
Min length: 3

Characters and Unicode

Total characters: 2538148
Distinct characters: 40
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: MMD
2nd row: MMD
3rd row: MMD
4th row: MMD
5th row: MMD

Common Values

Value | Count | Frequency (%)
Same Facility Refill | 95453 | 13.7%
MMD | 67085 | 9.7%
Individual delivery/home-based | 13633 | 2.0%
MMS | 2906 | 0.4%
Different Facility Refill (Private hospital/clinic) | 167 | < 0.1%
Other | 42 | < 0.1%
Fixed or ad hoc pick up points | 27 | < 0.1%
Mobile van/other vehicle | 10 | < 0.1%
PMVs/Chemists | 4 | < 0.1%
CPARP | 4 | < 0.1%
Other values (3) | 11 | < 0.1%
(Missing) | 515067 | 74.2%

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
facility | 95620 | 24.9%
refill | 95620 | 24.9%
same | 95453 | 24.8%
mmd | 67085 | 17.4%
individual | 13633 | 3.5%
delivery/home-based | 13633 | 3.5%
mms | 2906 | 0.8%
private | 167 | < 0.1%
different | 167 | < 0.1%
hospital/clinic | 167 | < 0.1%
Other values (21) | 300 | 0.1%

Most occurring characters

Value | Count | Frequency (%)
i | 328711 | 13.0%
l | 314488 | 12.4%
e | 246239 | 9.7%
a | 218710 | 8.6%
(space) | 205409 | 8.1%
M | 139996 | 5.5%
y | 109257 | 4.3%
m | 109090 | 4.3%
S | 98359 | 3.9%
t | 96212 | 3.8%
Other values (30) | 671677 | 26.5%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 1794110 | 70.7%
Uppercase Letter | 510820 | 20.1%
Space Separator | 205409 | 8.1%
Other Punctuation | 13830 | 0.5%
Dash Punctuation | 13637 | 0.5%
Open Punctuation | 171 | < 0.1%
Close Punctuation | 171 | < 0.1%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
i | 328711 | 18.3%
l | 314488 | 17.5%
e | 246239 | 13.7%
a | 218710 | 12.2%
y | 109257 | 6.1%
m | 109090 | 6.1%
t | 96212 | 5.4%
c | 96022 | 5.4%
f | 95954 | 5.3%
d | 54594 | 3.0%
Other values (11) | 124833 | 7.0%
Uppercase Letter
Value | Count | Frequency (%)
M | 139996 | 27.4%
S | 98359 | 19.3%
F | 95647 | 18.7%
R | 95631 | 18.7%
D | 67260 | 13.2%
I | 13633 | 2.7%
P | 187 | < 0.1%
O | 42 | < 0.1%
C | 30 | < 0.1%
A | 15 | < 0.1%
Other values (3) | 20 | < 0.1%
Other Punctuation
Value | Count | Frequency (%)
/ | 13822 | 99.9%
, | 8 | 0.1%
Space Separator
Value | Count | Frequency (%)
(space) | 205409 | 100.0%
Dash Punctuation
Value | Count | Frequency (%)
- | 13637 | 100.0%
Open Punctuation
Value | Count | Frequency (%)
( | 171 | 100.0%
Close Punctuation
Value | Count | Frequency (%)
) | 171 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 2304930 | 90.8%
Common | 233218 | 9.2%

Most frequent character per script

Latin
Value | Count | Frequency (%)
i | 328711 | 14.3%
l | 314488 | 13.6%
e | 246239 | 10.7%
a | 218710 | 9.5%
M | 139996 | 6.1%
y | 109257 | 4.7%
m | 109090 | 4.7%
S | 98359 | 4.3%
t | 96212 | 4.2%
c | 96022 | 4.2%
Other values (24) | 547846 | 23.8%
Common
Value | Count | Frequency (%)
(space) | 205409 | 88.1%
/ | 13822 | 5.9%
- | 13637 | 5.8%
( | 171 | 0.1%
) | 171 | 0.1%
, | 8 | < 0.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 2538148 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
i | 328711 | 13.0%
l | 314488 | 12.4%
e | 246239 | 9.7%
a | 218710 | 8.6%
(space) | 205409 | 8.1%
M | 139996 | 5.5%
y | 109257 | 4.3%
m | 109090 | 4.3%
S | 98359 | 3.9%
t | 96212 | 3.8%
Other values (30) | 671677 | 26.5%

BODY_WEIGHT
Real number (ℝ≥0)

SKEWED
ZEROS

Distinct: 66
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.07961143937
Minimum: 0
Maximum: 67
Zeros: 691674
Zeros (%): 99.6%
Negative: 0
Negative (%): 0.0%
Memory size: 5.3 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 0
Maximum: 67
Range: 67
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 1.394571974
Coefficient of variation (CV): 17.51723101
Kurtosis: 517.9834294
Mean: 0.07961143937
Median Absolute Deviation (MAD): 0
Skewness: 20.88914214
Sum: 55282.9
Variance: 1.944830991
Monotonicity: Not monotonic
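
Several of these figures are tied together by simple arithmetic, which gives a quick consistency check. The sketch below assumes the usual definitions (mean = sum / n, CV = standard deviation / mean, variance = standard deviation squared) and takes n from the report's "Number of observations"; the variable names are illustrative only. It also derives the mean over non-zero rows, a figure the report does not state.

```python
# Values copied from the BODY_WEIGHT statistics in this report.
n = 694409            # Number of observations (dataset overview)
zeros = 691674        # Rows where BODY_WEIGHT is 0
total = 55282.9       # Sum
std = 1.394571974     # Standard deviation

mean = total / n                  # should match the reported Mean
cv = std / mean                   # should match the reported CV
variance = std ** 2               # should match the reported Variance
nonzero_rows = n - zeros          # rows carrying a non-zero weight
mean_nonzero = total / nonzero_rows  # derived here, not stated in the report

print(f"mean={mean:.8f} cv={cv:.4f} variance={variance:.6f}")
print(f"non-zero rows={nonzero_rows} mean over non-zero rows={mean_nonzero:.2f}")
```

Because 99.6% of rows are zero, the overall mean says little about the weights that were actually recorded; the mean over non-zero rows is far higher.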
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 691674 | 99.6%
30 | 187 | < 0.1%
15 | 161 | < 0.1%
25 | 161 | < 0.1%
20 | 154 | < 0.1%
18 | 123 | < 0.1%
10 | 117 | < 0.1%
12 | 115 | < 0.1%
22 | 111 | < 0.1%
8 | 104 | < 0.1%
Other values (56) | 1502 | 0.2%

Value | Count | Frequency (%)
0 | 691674 | 99.6%
1 | 4 | < 0.1%
3 | 11 | < 0.1%
4 | 4 | < 0.1%
5 | 18 | < 0.1%
6 | 19 | < 0.1%
6.8 | 4 | < 0.1%
7 | 27 | < 0.1%
7.5 | 4 | < 0.1%
8 | 104 | < 0.1%

Value | Count | Frequency (%)
67 | 4 | < 0.1%
66 | 4 | < 0.1%
65 | 4 | < 0.1%
57 | 4 | < 0.1%
56 | 6 | < 0.1%
50 | 18 | < 0.1%
49 | 11 | < 0.1%
48 | 12 | < 0.1%
46 | 8 | < 0.1%
44 | 2 | < 0.1%

Interactions

(49 pairwise interaction plots; images not reproduced.)

Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
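
The calculation described above can be sketched in plain Python. This is an illustrative implementation, not the report generator's own code, and the function name `pearson_r` is hypothetical. The 1/n factors of the covariance and the standard deviations cancel, so they are omitted.

```python
import math


def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long sequences with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of (co)deviations; the shared 1/n normalisation cancels in the ratio.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")  # undefined when one variable is constant
    return cov / math.sqrt(var_x * var_y)


xs = [1.0, 2.0, 3.0, 4.0]
r_up = pearson_r(xs, [2 * x + 5 for x in xs])     # increasing linear relation
r_down = pearson_r(xs, [-3 * x + 1 for x in xs])  # decreasing linear relation
print(r_up, r_down)
```

The two calls differ in slope and intercept yet give coefficients of equal magnitude and opposite sign, which illustrates the invariance to location and scale. A constant column such as State has zero variance, so r is undefined for it.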

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
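
A minimal sketch of that recipe: replace each value by its rank, then apply the covariance-over-standard-deviations formula to the ranks. Giving tied values the average of their positions is an assumption made here (a common convention), and the function names are hypothetical.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the mean of the positions they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of 0-based positions i..j, made 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(xs, ys):
    """Pearson correlation computed on the rank variables of xs and ys."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return float("nan")  # undefined when one variable is constant
    return cov / math.sqrt(var_x * var_y)


xs = [1, 2, 3, 4, 5]
ys = [x ** 3 for x in xs]  # monotonic but clearly not linear
rho = spearman_rho(xs, ys)
print(rho)
```

Because only the ordering matters, the cubic relation is scored as a perfect monotonic association even though it is not linear.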

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the discordant pairs divided by the total number of pairs.
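
Taken literally, that description corresponds to the variant usually called tau-a. The sketch below implements it by brute force over all pairs; the name `kendall_tau_a` is hypothetical. Statistical libraries often report a tie-adjusted variant (tau-b) instead, so values may differ from the report when the data contain ties.

```python
from itertools import combinations


def kendall_tau_a(xs, ys):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = discordant = pairs = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        pairs += 1
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1   # both variables order the pair the same way
        elif sign < 0:
            discordant += 1   # the variables order the pair in opposite ways
        # sign == 0 means a tie in x or y: counted in the total only
    if pairs == 0:
        raise ValueError("need at least two observations")
    return (concordant - discordant) / pairs


xs = [1, 2, 3, 4]
ys = [1, 3, 2, 4]  # one swapped pair out of six
tau = kendall_tau_a(xs, ys)
print(tau)
```

The loop visits every pair, so this quadratic-time version is suitable only for small samples; it is meant to make the definition concrete rather than to scale to several hundred thousand rows.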

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. There is extensive documentation available here.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been proven to be biased, even for large samples. We use a bias-corrected measure proposed by Bergsma in 2013.
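
As a sketch only, the commonly cited form of the bias correction can be written in plain Python as follows. The exact variant used by the report generator is an assumption here, and `cramers_v_corrected` is a hypothetical name. With r and k the numbers of distinct values of the two variables and n the sample size, the chi-squared statistic is shrunk and the table dimensions adjusted before taking the ratio.

```python
import math
from collections import Counter


def cramers_v_corrected(xs, ys):
    """Bias-corrected Cramér's V for two equally long sequences of labels."""
    n = len(xs)
    rows, cols = Counter(xs), Counter(ys)
    joint = Counter(zip(xs, ys))
    r, k = len(rows), len(cols)
    if n < 2 or r < 2 or k < 2:
        return float("nan")  # undefined for a constant variable
    # Pearson chi-squared statistic of the contingency table.
    chi2 = 0.0
    for a, count_a in rows.items():
        for b, count_b in cols.items():
            expected = count_a * count_b / n
            chi2 += (joint.get((a, b), 0) - expected) ** 2 / expected
    phi2 = chi2 / n
    # Bias correction: shrink phi^2 and the effective table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(k_corr - 1, r_corr - 1)
    if denom <= 0:
        return float("nan")
    return math.sqrt(phi2_corr / denom)


# Toy labels: y is fully determined by x, so the association is perfect.
v_perfect = cramers_v_corrected(["a", "b"] * 50, ["u", "v"] * 50)
# Toy labels: every combination is equally frequent, so there is no association.
v_none = cramers_v_corrected(["a", "a", "b", "b"], ["u", "v", "u", "v"])
print(v_perfect, v_none)
```

Columns flagged as constant in this report, such as State or AFTERNOON, have only one distinct value, so the sketch returns NaN for them rather than a coefficient.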

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
The dendrogram allows you to more fully correlate variable completion, revealing trends deeper than the pairwise ones visible in the correlation heatmap.
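
Nullity correlation, as used for the heatmap, can be illustrated as an ordinary correlation between "is missing" indicators. The sketch below assumes missing cells are represented as `None` and uses toy columns rather than data from this report; the function name is hypothetical, and the plotting library may compute the figure differently.

```python
import math


def nullity_correlation(col_a, col_b):
    """Pearson correlation between the 0/1 missingness indicators of two columns."""
    a = [1.0 if v is None else 0.0 for v in col_a]
    b = [1.0 if v is None else 0.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return float("nan")  # a column that is never (or always) missing
    return cov / math.sqrt(var_a * var_b)


# Toy columns: x and y are missing in the same rows, z in the opposite rows.
col_x = [None, None, 1, 2]
col_y = [None, None, 3, 4]
col_z = [5, 6, None, None]
same = nullity_correlation(col_x, col_y)
opposite = nullity_correlation(col_x, col_z)
print(same, opposite)
```

A value near +1 means two columns tend to be missing together, a value near -1 means one is present when the other is absent, and fully populated columns have no defined nullity correlation.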

Sample

First rows

Row | State | L.G.A | Facility Name | Regimen Line | Regimen | PHARMACY_ID | PATIENT_ID | FACILITY_ID | DATE_VISIT | DURATION | MORNING | AFTERNOON | EVENING | ADR_SCREENED | ADR_IDS | PRESCRIP_ERROR | ADHERENCE | NEXT_APPOINTMENT | DMOC_TYPE | BODY_WEIGHT
0 | Niger | Magama | Rural Hosp- Auna | ART First Line Adult | TDF(300mg)+3TC(300mg)+DTG(50mg) | 30577 | 8232 | 10011 | 2020-03-16 00:00:00 | 90 | 0.0 | 0 | 1 | NaN | NaN | 0 | 0 | 2020-07-16 00:00:00 | MMD | 0.0
1 | Niger | Magama | Rural Hosp- Auna | ART First Line Adult | TDF(300mg)+3TC(300mg)+DTG(50mg) | 30578 | 8219 | 10011 | 2020-03-02 00:00:00 | 30 | 1.0 | 0 | 0 | NaN | NaN | 0 | 0 | 2020-04-02 00:00:00 | NaN | 0.0
2 | Niger | Magama | Rural Hosp- Auna | ART First Line Adult | TDF(300mg)+3TC(300mg)+DTG(50mg) | 30579 | 8246 | 10011 | 2020-02-26 00:00:00 | 90 | 0.0 | 0 | 1 | NaN | NaN | 0 | 0 | 2020-05-17 00:00:00 | MMD | 0.0
3 | Niger | Magama | Rural Hosp- Auna | ART First Line Adult | TDF(300mg)+3TC(300mg)+DTG(50mg) | 30580 | 8217 | 10011 | 2020-06-06 00:00:00 | 90 | 0.0 | 0 | 1 | NaN | NaN | 0 | 0 | 2020-09-04 00:00:00 | MMD | 0.0
4 | Niger | Magama | Rural Hosp- Auna | ART First Line Adult | TDF(300mg)+3TC(300mg)+DTG(50mg) | 30581 | 8243 | 10011 | 2020-04-18 00:00:00 | 90 | 0.0 | 0 | 1 | NaN | NaN | 0 | 0 | 2020-07-16 00:00:00 | MMD | 0.0
5 | Niger | Magama | Rural Hosp- Auna | ART First Line Adult | TDF(300mg)+3TC(300mg)+DTG(50mg) | 30582 | 8256 | 10011 | 2020-03-02 00:00:00 | 30 | 0.0 | 0 | 1 | NaN | NaN | 0 | 0 | 2020-04-02 00:00:00 | NaN | 0.0
6 | Niger | Magama | Rural Hosp- Auna | ART First Line Adult | TDF(300mg)+3TC(300mg)+DTG(50mg) | 30583 | 8246 | 10011 | 2020-02-11 00:00:00 | 90 | 0.0 | 0 | 0 | NaN | NaN | 0 | 0 | 2020-05-11 00:00:00 | NaN | 0.0
7 | Niger | Magama | Rural Hosp- Auna | ART First Line Adult | TDF(300mg)+3TC(300mg)+LPV/r(200/50mg) | 30584 | 8240 | 10011 | 2017-05-27 00:00:00 | 30 | 0.0 | 0 | 0 | NaN | NaN | 0 | 0 | 2017-06-26 00:00:00 | NaN | 0.0
8 | Niger | Magama | Rural Hosp- Auna | ART First Line Adult | TDF(300mg)+3TC(300mg)+DTG(50mg) | 30585 | 8230 | 10011 | 2020-05-08 00:00:00 | 90 | 0.0 | 0 | 1 | NaN | NaN | 0 | 0 | 2020-08-05 00:00:00 | MMD | 0.0
9 | Niger | Magama | Rural Hosp- Auna | ART First Line Adult | TDF(300mg)+3TC(300mg)+DTG(50mg) | 30586 | 8223 | 10011 | 2020-02-26 00:00:00 | 30 | 0.0 | 0 | 1 | NaN | NaN | 0 | 0 | 2020-03-26 00:00:00 | NaN | 0.0

Last rows

Row | State | L.G.A | Facility Name | Regimen Line | Regimen | PHARMACY_ID | PATIENT_ID | FACILITY_ID | DATE_VISIT | DURATION | MORNING | AFTERNOON | EVENING | ADR_SCREENED | ADR_IDS | PRESCRIP_ERROR | ADHERENCE | NEXT_APPOINTMENT | DMOC_TYPE | BODY_WEIGHT
694399 | Niger | Kontagora | General Hospital Kontagora | Cotrimoxazole (CTX) Prophylaxis | Cotrimoxazole 960mg | 4082170 | 160836 | 3005 | 2021-05-28 00:00:00 | 90 | 1.0 | 0 | 0 | No | NaN | 0 | 0 | 2021-08-25 00:00:00 | Same Facility Refill | 0.0
694400 | Niger | Kontagora | General Hospital Kontagora | Isoniazid Preventive Therapy (IPT) | Isoniazid 300mg | 4082173 | 144877 | 3005 | 2020-08-28 00:00:00 | 15 | 1.0 | 0 | 0 | No | NaN | 0 | 0 | 2020-09-10 00:00:00 | Same Facility Refill | 0.0
694401 | Niger | Kontagora | General Hospital Kontagora | ART Second Line Adult | TDF(300mg)+3TC(150mg)+ATV/r(300/100mg) | 4082177 | 145221 | 3005 | 2021-05-27 00:00:00 | 90 | 1.0 | 0 | 0 | No | NaN | 0 | 0 | 2021-09-06 00:00:00 | Same Facility Refill | 0.0
694402 | Niger | Borgu | Wawa BHC | ART First Line Adult | TDF(300mg)+3TC(300mg)+DTG(50mg) | 4082387 | 160854 | 10021 | 2021-05-30 00:00:00 | 180 | 0.0 | 0 | 1 | No | NaN | 0 | 0 | 2021-10-30 00:00:00 | Same Facility Refill | 0.0
694403 | Niger | Borgu | Wawa BHC | ART First Line Adult | TDF(300mg)+3TC(300mg)+DTG(50mg) | 4082388 | 160854 | 10021 | 2021-05-30 00:00:00 | 180 | 0.0 | 0 | 1 | No | NaN | 0 | 0 | 2021-10-30 00:00:00 | Same Facility Refill | 0.0
694404 | Niger | Borgu | Wawa BHC | ART First Line Adult | TDF(300mg)+3TC(300mg)+DTG(50mg) | 4082389 | 160854 | 10021 | 2021-05-30 00:00:00 | 180 | 1.0 | 0 | 0 | No | NaN | 0 | 0 | 2021-10-30 00:00:00 | Same Facility Refill | 0.0
694405 | Niger | Lapai | General Hospital - Lapai | ART First Line Adult | TDF(300mg)+3TC(300mg)+DTG(50mg) | 4082393 | 131562 | 10016 | 2021-05-06 00:00:00 | 180 | 1.0 | 0 | 0 | No | NaN | 0 | 0 | 2021-10-21 00:00:00 | Same Facility Refill | 0.0
694406 | Niger | Lapai | General Hospital - Lapai | ART First Line Adult | TDF(300mg)+3TC(300mg)+DTG(50mg) | 4082394 | 131562 | 10016 | 2021-05-06 00:00:00 | 180 | 0.0 | 0 | 1 | No | NaN | 0 | 0 | 2021-10-21 00:00:00 | Same Facility Refill | 0.0
694407 | Niger | Lapai | General Hospital - Lapai | Cotrimoxazole (CTX) Prophylaxis | Cotrimoxazole 960mg | 4082395 | 131562 | 10016 | 2021-05-06 00:00:00 | 180 | 1.0 | 0 | 0 | No | NaN | 0 | 0 | 2021-10-21 00:00:00 | Same Facility Refill | 0.0
694408 | Niger | Lapai | General Hospital - Lapai | ART First Line Adult | TDF(300mg)+3TC(300mg)+DTG(50mg) | 4082396 | 131562 | 10016 | 2021-05-06 00:00:00 | 180 | 0.0 | 0 | 1 | No | NaN | 0 | 0 | 2021-10-21 00:00:00 | Same Facility Refill | 0.0